Data compression

ABSTRACT

A method of compressing an image frame composed of an array of pixels in the form of digital electrical signals, the method comprising the steps of: (a) providing a reference image as a first approximation to the original image to be compressed; (b) dividing the original image into patches of one or more sizes; (c) for each patch determining a compressed encoding of the data contained therein where the compressed encoding can be uncompressed to provide an approximation to the patch; (d) selecting that one of the compressed encodings which, when uncompressed and added to the reference image, gives the biggest improvement therein relative to the original image; (e) adding the selected compressed encoding in uncompressed form to the reference image and in compressed form to a compressed representation of the original image; and (f) recursively repeating step (d) for the remaining compressed encodings until either a desired quality level of the reference image or a maximum data size of the compressed representation is achieved.

The present invention relates to a method and apparatus for compressing arrays of data in the form of digital electrical signals and is applicable in particular, though not necessarily, to the compression of digitally encoded image sequences.

There have recently been proposed a number of techniques for compressing arrays of data and in particular two dimensional images which may for example be individual images or frames of a video sequence. Some of the image compression techniques employ what is known as "vector quantisation" where a codebook of reference patches (e.g. relatively small portions taken from one or more "library" images) is constructed. An image to be compressed is partitioned into a number of image patches and a matching (i.e. similar) reference patch selected for each image patch from the codebook. The codebook index for each chosen reference patch in the codebook is stored, together with the corresponding position vectors of the image patches (i.e. their position in the original image), to provide a compressed representation of the image. This coding for each patch is referred to as a `compressed encoding`. Providing that a copy of the codebook is available, an approximation to the original image can be constructed by using the stored codebook indices to recover the required set of reference patches and inserting these into an image frame using the respective stored image patch position vectors. The achievable degree of compression is a function of the size of the image patches into which the image is partitioned, larger patches allowing higher compression.

It is recognised that for most images, certain areas of an image will contain more detail than other areas and that, if the patches into which the image is divided are of a size intended to achieve high compression, the detailed areas of the image may not be adequately represented in the compressed representation. It has therefore been proposed that areas of an image containing significant detail should be represented by relatively small patches whilst areas containing relatively little or no detail should be represented by larger patches. This process involves temporarily reconstructing an image using the first level large patches identified from the codebook. This provides a reference image which can be improved in a stepwise manner. It will be appreciated that this temporary image represents a decompressed version of the `compressed` image. In order to identify areas of detail which require representation by smaller patches, after each new large reference patch is added to the compressed image, the corresponding temporary image is compared against the original image to identify the region of the original image which maximally differs from the temporary image. The image patch containing this region is then sub-divided into smaller image patches and reference patches are identified from the codebook for these newly created image patches. Codebook indices and position vectors for the identified reference patches are then added to the compressed representation. The process is recursively repeated until either the temporary image reaches a desired quality level or the data size of the compressed representation reaches some maximum threshold level.

A problem with this maximum difference approach is that because the codebook is necessarily of a finite size, in certain circumstances it may be extremely difficult to find a close match for certain regions of an image. A considerable amount of processing time, and storage space in the compressed image, may be devoted to representing regions of the original image which, although identified as being poorly represented by the compressed image, can never be satisfactorily represented due to the limitations of the codebook.

It is an object of the present invention to overcome or mitigate at least one of the disadvantages of such known data compression processes.

It is another object of the invention to provide a data compression method and apparatus which optimises the quality of the compressed data.

According to a first aspect of the present invention there is provided a method of compressing an array of data entries in the form of digital electrical signals, the method comprising the steps of:

(a) providing a reference data array as a first approximation to the original data array to be compressed;

(b) dividing the original data array into blocks of one or more sizes;

(c) for each data block, determining a compressed encoding of the data contained therein where the compressed encoding can be uncompressed to provide an approximation to the data block;

(d) selecting that one of the compressed encodings which, when uncompressed and added to the reference data array, gives the biggest improvement therein relative to the original data array;

(e) adding the selected compressed encoding in uncompressed form to the reference data array and in compressed form to a compressed representation of the original data array; and

(f) recursively repeating step (d) for the remaining compressed encodings until either a desired quality level of the reference data array or a maximum data size of the compressed representation is achieved.

The present invention `concentrates` on improving those areas of the reference data array for which improvement is both possible and most advantageous. This contrasts with known compression processes which often concentrate on areas for which improvement is not possible or can only be achieved to a marginal extent.

The step (c) of determining a compressed encoding of each data block may comprise searching a codebook of reference data blocks, where each reference block is assigned a unique codebook index, to find the closest matching reference block. Alternatively, this step may involve determining some other appropriate representation such as a discrete cosine transform (DCT) or discrete Fourier transform (DFT) for each block.

The compressed encoding selected in step (d), i.e. that which when in uncompressed form results in the greatest improvement in the reference data array, may be identified by: firstly determining the error ε₁ between each of said data blocks and the respective compressed encodings in uncompressed form; determining the error ε₂ between each of said data blocks and the respective blocks of the reference data array; and selecting that compressed encoding for which the reduction in error, ε₂ - ε₁, is greatest. In one example, the errors ε₁ and ε₂ may be evaluated as the total squared difference between respective entries of the compared data blocks.
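By way of illustration only, the selection of step (d) might be sketched as follows in Python. The function names `total_squared_difference` and `select_best_encoding` are hypothetical and do not appear in the specification, blocks are assumed to be flat lists of intensities, and the improvement is taken to be the error of the reference block less the error of the uncompressed encoding.

```python
def total_squared_difference(a, b):
    """Sum of squared differences between two equally sized blocks (flat lists)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_best_encoding(original_blocks, decoded_blocks, reference_blocks):
    """Return (index, gain) of the block whose decoded encoding most improves the reference."""
    best_index, best_gain = None, float("-inf")
    for i, (orig, dec, ref) in enumerate(
        zip(original_blocks, decoded_blocks, reference_blocks)
    ):
        e1 = total_squared_difference(orig, dec)  # error of the uncompressed encoding
        e2 = total_squared_difference(orig, ref)  # error of the current reference block
        gain = e2 - e1                            # reduction in error if the encoding is used
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain


# Illustrative data only: three blocks against an all-zero reference array.
original = [[10, 10], [50, 60], [0, 0]]
decoded = [[9, 11], [40, 40], [0, 1]]
reference = [[0, 0], [0, 0], [0, 0]]
print(select_best_encoding(original, decoded, reference))
```

In this example the second block is chosen even though its encoding is the least accurate of the three, because it removes the most error from the reference array.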

Whilst the invention can be applied to the compression of any digital data, for example representing audio signals or 3D pressure variations within the atmosphere, it is particularly applicable to the compression of two dimensional images which comprise an array of pixels, each having a pixel intensity value or values. Typically, the data blocks comprise patches of adjacent pixels. Whilst the patches may be created by uniformly dividing the image on a single level, as an alternative, the image may be arbitrarily divided on two or more levels to create several sets of overlapping and/or differently sized or shaped patches. Indeed, the creation of overlapping patches may be particularly advantageous as it reduces or eliminates edge artifacts which can otherwise give a "tiled" effect to a decompressed image. The compressed representations or reference patches derived for overlapping patches are combined in the reference image using a predetermined criterion. For example, the overlapping regions may be derived from a weighted combination of the overlapping pixel intensities. Alternatively, a rule may be provided for choosing which pixels to choose from which patch, so that the pixels can be combined in a non-overlapping manner.

Where an image is sub-divided into differently sized patches on two or more levels, it is possible that a reference patch may be selected for addition to the reference image which patch completely or partially occludes a previously selected patch. Where the occlusion is complete, the previously selected patch may be removed from the reference image. However, where the occlusion is partial, it may be appropriate to determine if the partially occluded patch should be replaced with an alternative patch (or transformed in some way) to better approximate that portion which is not occluded. Alternatively, the partially occluded patch may be completely removed from the reference image, when by so doing the quality of the image may be improved or the reduction in quality is more than compensated for by an increase in quality obtained by insertion into the reference image of some reference patch in another region of the image frame.

This application of the invention can be extended to the compression and transmission of video pictures (e.g. using CD or video tape media) and more particularly to the compression and transmission of video pictures between videophones. In an embodiment of the invention, the reference data array provided at step (a) is an arbitrary image such as a uniformly black image. The reference image is stored in a memory of the transmitting videophone and a memory of the receiving videophone. Compressed encodings of image patches obtained from the first image frame to be captured by the phone's camera are added recursively to a buffer memory and simultaneously in uncompressed form to the reference image in the transmitting phone's memory until a predetermined amount of data is held in the buffer memory. The data stored in the buffer is then transmitted to the receiving videophone.

At some time interval (the video camera `refresh rate` or a multiple thereof) after capturing the first image frame, the transmitting videophone will capture a second image frame. If this time interval is relatively short, e.g. 0.1 seconds, it is likely that this subsequent image frame will be similar to the first frame and that only marginal changes will therefore be required to update the display at the receiving videophone. It would be extremely inefficient to completely discard the image data already transmitted to the receiving videophone, and start the compression process again. Rather, the initial reference image frame modified during compression of the first image frame by repeated application of step (d) can form a new reference image frame for the second image frame. Following execution of steps (b) and (c), the repeated application of step (d) identifies those compressed encodings which in uncompressed form reduce the error between the new reference image frame and the second image frame. These compressed encodings are stored in the buffer memory until the required data level is achieved whereupon they are transmitted to the receiving videophone to update the image frame displayed thereon. It will be appreciated that those areas of the captured image frame which have changed will be most susceptible to improvement and that the compressed encodings identified in step (d) will tend to correspond to those areas.

In the case where the compressed encodings are obtained by searching a codebook of reference patches, the codebook can be constructed using any one of a number of known techniques. The codebook preferably comprises a number of sets of differently sized patches. For example, the codebook may comprise a set of large patches (e.g. 64 by 64 pixels), a set of intermediate patches (e.g. 16 by 16 pixels), and a set of small patches (e.g. 8 by 8 pixels) to match similarly sized patches into which the image to be compressed is divided.

In order to improve the quality of the compressed representation, identification of matching patches from the codebook may involve transforming the reference patches or the image patches using, for example, rotation or reflection. The transformation required to improve the match will be stored with the image patch position vectors and the codebook indices as part of the compressed representation. Similarly, the image and reference patches may be normalised for brightness and contrast with the resulting normalising constants forming part of the compressed representation.
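A minimal Python sketch of such a transformed match is given below. It assumes square patches held as lists of rows and tries the eight combinations of quarter-turn rotation and horizontal reflection; the names `rotate90`, `reflect`, `transforms` and `best_transform`, and the `(quarter_turns, reflected)` identifier stored for the chosen transformation, are hypothetical choices made for the example rather than details taken from the specification.

```python
def rotate90(patch):
    """Rotate a square patch 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def reflect(patch):
    """Mirror a patch left to right."""
    return [row[::-1] for row in patch]


def transforms(patch):
    """Yield ((quarter_turns, reflected), transformed_patch) for all 8 variants."""
    current = [list(row) for row in patch]
    for quarter_turns in range(4):
        yield (quarter_turns, False), current
        yield (quarter_turns, True), reflect(current)
        current = rotate90(current)


def best_transform(image_patch, reference_patch):
    """Return (transform_id, error) for the variant closest to the image patch."""
    def error(candidate):
        return sum(
            (a - b) ** 2
            for row_a, row_b in zip(image_patch, candidate)
            for a, b in zip(row_a, row_b)
        )
    scored = ((tid, error(t)) for tid, t in transforms(reference_patch))
    return min(scored, key=lambda item: item[1])


reference_patch = [[1, 2], [3, 4]]
image_patch = [[3, 1], [4, 2]]  # the reference patch turned a quarter turn clockwise
print(best_transform(image_patch, reference_patch))
```

Only the small transform identifier needs to be stored alongside the codebook index and position vector, so the extra cost per patch is a few bits.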

Where the image to be compressed is one of a sequence of video image frames, and where scene components are in motion between subsequent frames, it may be that a good approximation to an image patch containing the moving component or a portion thereof may be obtained by extracting a patch from an adjacent portion of the compressed representation (in uncompressed form) of the preceding frame. The compression method may therefore comprise temporarily adding patches extracted from the compressed representation of the preceding frame to the codebook, or to a supplementary codebook, prior to conducting a search of the codebook(s). In order to reduce the search time, the search may be restricted to patches extracted from areas of the compressed representation of the preceding frame which are adjacent to the image patch for which the search is being conducted.

It will be appreciated that an object, e.g. a person, moving about in front of a stationary, or near stationary, background, may be continually covering and uncovering portions of the background. It may thus be appropriate to store in the supplementary codebook patches obtained from a number of frames directly preceding the frame to be compressed. This principle can be extended further, such that a codebook is created for an auxiliary image prior to commencing a videophone `conversation`. This auxiliary image would typically be the background in front of which the caller is to be situated. Of course, multiple auxiliary images can be used if appropriate.

A further development is to flag patches taken from the auxiliary image(s) so that they can be recognised as such by the receiving videophone. This flag could allow the decompression algorithm to divert the codebook look-up operation to a `substitute` background codebook allowing the caller to appear in front of some alternative background, e.g. a palm fringed beach rather than the boardroom wall in front of which he or she is actually situated.

According to a second aspect of the present invention there is provided apparatus for compressing an array of data entries in the form of digital electrical signals, the apparatus comprising a digital computer, for example of known type, set up to run a program embodying the method set out according to the above first aspect.

For a better understanding of the present invention and in order to show how the same may be carried into effect, an embodiment of the invention will be described by way of example with reference to the accompanying drawing which shows a division of part of an image frame to be compressed into multi-layer image patches.

Consider firstly a two dimensional image frame composed of a regular array of pixels, each of which has associated therewith a pixel intensity. The first stage in the compression process is to create a codebook by extracting reference patches from one or more library image frames. These library image frames preferably contain features which are likely to be contained in the image frame to be compressed. A typical codebook would contain several hundred reference patches of different sizes, e.g. a set of 32×32 pixel patches, a set of 16×16 pixel patches, and a set of 8×8 pixel patches, each of the patches being identified by a unique codebook index. Each of the reference patches is normalised for brightness by determining the average intensity of the pixels within the patch and subtracting that average value from the intensity value for each pixel. Similarly, the patches are normalised for contrast by dividing the intensity value of each pixel within a patch by the maximum absolute intensity value within that patch following normalisation for brightness. Thus, the pixel intensity values within each patch will lie between -1 and +1.
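The two normalisation steps can be sketched in Python as follows. This is an illustrative sketch only: the names `normalise_patch` and `denormalise_patch` are hypothetical, patches are assumed to be lists of rows, and the handling of a perfectly flat patch (where the maximum absolute value is zero) is an assumption, since the description does not address that case.

```python
def normalise_patch(patch):
    """Return (normalised_patch, mean, scale) for a patch given as a list of rows."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)                   # brightness constant
    centred = [[p - mean for p in row] for row in patch]
    scale = max(abs(p) for row in centred for p in row)  # contrast constant
    if scale == 0:
        # Flat patch: nothing to scale (assumed behaviour, not from the source).
        return centred, mean, 0.0
    normalised = [[p / scale for p in row] for row in centred]
    return normalised, mean, scale


def denormalise_patch(normalised, mean, scale):
    """Invert the normalisation using the stored constants."""
    return [[p * scale + mean for p in row] for row in normalised]


patch = [[10, 20], [30, 60]]
norm, mean, scale = normalise_patch(patch)
restored = denormalise_patch(norm, mean, scale)
print(norm, mean, scale)
```

The pair `(mean, scale)` corresponds to the brightness and contrast normalisation factors that are stored with each compressed encoding so that the decoder can restore the original intensity range.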

An image frame to be compressed is sub-divided into an array of contiguous data blocks or image patches (each of which has a unique position vector) at a first level. For an image comprising 512×512 pixels, these patches will be of size 32×32 pixels so that the image frame is divided into 256 large image patches. The image is then subdivided a further two times, at different levels, to provide a set of 16×16 pixel patches and a set of 8×8 pixel patches. The sub-division is such that the 8×8 patches overlap the edges of the 16×16 patches and the 16×16 patches overlap the 32×32 patches. This overlapping arrangement is illustrated in the accompanying drawing although one of the levels of sub-division is omitted for the sake of clarity. Each image patch is assigned a position vector, which may conveniently be the position in the image of the centre of the patch.
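The arithmetic of this sub-division can be checked with a short Python sketch. The function `patch_positions` is hypothetical, and the choice of shifting each finer level by half its patch size so that its patches straddle the boundaries of the coarser grid is an assumption made for the example; the description does not specify the offsets used.

```python
def patch_positions(width, height, size, offset=0):
    """List the patches of a given size that lie fully inside the image.

    Each entry records the top-left corner and the centre, the centre
    serving as the patch's position vector.
    """
    positions = []
    for top in range(offset, height - size + 1, size):
        for left in range(offset, width - size + 1, size):
            centre = (left + size // 2, top + size // 2)
            positions.append({"top": top, "left": left, "size": size, "centre": centre})
    return positions


# Three levels for a 512x512 image; the offsets of 8 and 4 are assumed values.
levels = {
    32: patch_positions(512, 512, 32),
    16: patch_positions(512, 512, 16, offset=8),
    8: patch_positions(512, 512, 8, offset=4),
}
for size, patches in levels.items():
    print(size, len(patches))
```

The first level yields the 256 patches stated above (16 patches along each side). With the assumed offsets the finer levels yield 31×31 and 63×63 patches, slightly fewer than an unshifted grid because patches that would cross the image border are dropped.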

All of the image patches are normalised for both brightness and contrast as described above so that the pixel intensities lie between -1 and +1. For each normalised image patch, the correspondingly sized entries in the codebook are searched to identify that one which most closely resembles the image patch. For a black and white image, the search may be carried out by comparing each pixel of the selected image patch with the corresponding pixel of each reference patch, using for example a mean squared difference approach (for colour images the intensities for each of the colours are compared). Alternatively, and in order to minimise the amount of computation required, some "global" characteristic or characteristics may be calculated for each of the reference and image patches and the search conducted by comparing these global characteristics. One example of this type of search strategy is described in WO95/20296.
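The exhaustive pixel-by-pixel search can be sketched in Python as below. It is a minimal sketch assuming a codebook held as a dictionary from index to normalised patch; the names `mean_squared_difference` and `find_closest` and the tiny 2×2 codebook are invented for the example, and the "global characteristic" strategy of WO95/20296 is not attempted here.

```python
def mean_squared_difference(a, b):
    """Mean of squared pixel differences between two equally sized patches."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)


def find_closest(image_patch, codebook):
    """Return (codebook_index, error) of the best match among same-sized entries."""
    best_index, best_error = None, float("inf")
    for index, reference_patch in codebook.items():
        if len(reference_patch) != len(image_patch):
            continue  # only correspondingly sized entries are searched
        error = mean_squared_difference(image_patch, reference_patch)
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


# Illustrative normalised 2x2 reference patches.
codebook = {
    0: [[-1.0, 1.0], [-1.0, 1.0]],   # vertical edge
    1: [[-1.0, -1.0], [1.0, 1.0]],   # horizontal edge
    2: [[1.0, -1.0], [-1.0, 1.0]],   # checker pattern
}
patch = [[-0.9, 1.0], [-1.0, 0.8]]
index, error = find_closest(patch, codebook)
print(index, error)
```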

The codebook index of each of the matched reference patches is stored, together with the position vector and the brightness and contrast normalisation factors for the matched image patch, as a compressed encoding. For each codebook reference patch and image patch pair matched together, the error between the two patches is evaluated to provide a set of errors ε₁. One suitable measure of error is the total squared difference between the respective pixel arrays. This requires the intensities of the corresponding pixels to be subtracted from one another, and the resulting differences squared and summed.

The compression method involves the provision of some arbitrary reference image frame which can be taken as a first, albeit extremely poor, approximation to the image frame to be compressed. The above error computation operation is carried out for the image patches and respective correspondingly sized areas extracted from the reference image frame so that a set of actual errors ε₂ between the reference image frame and the frame to be compressed is obtained. This set of errors is then compared with the set of errors ε₁ obtained for the reference patches.

From the comparison, that codebook reference patch which would lead to the greatest reduction in error, i.e. the greatest improvement in image quality, if added to the reference image, is identified. The corresponding compressed encoding is stored in a compressed image representation. In addition, the reference image is updated by adding the selected reference patch thereto.

This process is recursively repeated so that reference patches are continually added to the reference image frame leading to a stepwise improvement in the quality of the reference image frame. In parallel, the compressed encodings are added to the compressed representation. It is noted that use of the total squared error to measure improvement will tend to add larger reference patches prior to smaller ones, although this will not always be the case. Typically the recursion will continue until the quality of the reference image frame reaches some acceptable threshold level, for example by ensuring that the error between the reference image frame and the original image frame at any one point does not exceed a threshold level. Alternatively, the process may continue until the amount of data contained in the compressed image representation reaches some threshold level.
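The repeated selection and its two stopping conditions can be illustrated with the following Python sketch. To keep it short it works on a one dimensional array rather than an image, takes the candidate encodings as already decoded, and measures the size of the compressed representation simply as a count of encodings; the names `greedy_compress`, `max_encodings` and `target_error` are hypothetical, and a whole-array error threshold is used in place of the per-point threshold mentioned above.

```python
def squared_error(a, b):
    """Total squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_compress(original, candidates, max_encodings, target_error):
    """Greedily insert decoded candidates into an all-zero reference array.

    candidates: list of (code, start, decoded_values), where `code` stands in
    for whatever compressed encoding the encoder produced for that block.
    Returns the compressed representation and the final reference array.
    """
    reference = [0] * len(original)          # arbitrary first approximation
    remaining = list(candidates)
    compressed = []
    while remaining and len(compressed) < max_encodings:
        if squared_error(original, reference) <= target_error:
            break                            # desired quality reached
        best, best_gain = None, 0
        for candidate in remaining:
            _, start, values = candidate
            end = start + len(values)
            e1 = squared_error(original[start:end], values)
            e2 = squared_error(original[start:end], reference[start:end])
            if e2 - e1 > best_gain:
                best, best_gain = candidate, e2 - e1
        if best is None:
            break                            # nothing left that improves the reference
        code, start, values = best
        reference[start:start + len(values)] = values
        compressed.append((code, start))
        remaining.remove(best)
    return compressed, reference


original = [5, 5, 5, 5, 9, 1, 0, 0]
candidates = [
    ("A", 0, [5, 5, 5, 5]),   # large block, exact
    ("B", 4, [5, 5, 0, 0]),   # large block, poor
    ("C", 4, [9, 1]),         # small block, exact
    ("D", 6, [3, 3]),         # would make the reference worse
]
compressed, reference = greedy_compress(original, candidates, max_encodings=4, target_error=0)
print(compressed, reference)
```

Candidate "D" is never chosen because it would increase the error, which reflects the point that effort is spent only where an improvement is actually available. The gains are recomputed on every pass because each insertion changes the reference array against which the remaining candidates are judged.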

It is envisaged that the image compression method described above could be used to compress `still` two dimensional image frames or to compress a set of image frames forming a video sequence. One application to which the method is particularly applicable is that of the transmission of video images between videophones. Whilst the concept of videophones has been around for a number of years, the commercial success of videophones has been limited in practice by the poor quality of the images displayed on the videophones. This poor quality results from the low signal bandwidth available over the transmission channels combined with inappropriate qualities, delays and frame rates available with current compression techniques.

An improvement to existing transmission systems would be to compress each frame to be transmitted using the compression method detailed above. However, an even greater improvement can be achieved by recognising that the changes between successive images of the video sequence are likely to be extremely small. Typically, the background to the videophone operator will remain stationary whilst the operator's head and facial features may vary to some small extent between frames.

The above method is applied to videophone signal transmission as follows. The first frame captured by the transmitting videophone is compressed as described above using some initial arbitrary reference image frame which is stored in memory. The same reference image frame is also stored in a buffer memory of the receiving videophone. A set of compressed encodings are obtained as described above and are stored in buffer memory up to a preset data limit, determined by the signal bandwidth of the transmission channel and desired frame rate of the image sequence. This data is then transmitted from the buffer memory to the receiving videophone where it is decoded and added to the reference image frame for display. The data is also added to the reference image stored in the transmitting videophone's memory.

Whilst the transmitting videophone is compressing the first captured image frame, its camera captures a second image frame. Following transmission of the first set of patch data, the compression process is repeated but this time looks for patches which will give the best improvement of the `new` reference image relative to the second captured image frame. The compressed encodings of these patches are again stored in the buffer memory prior to transmission to the receiving videophone. The process is repeated for each newly captured frame such that the reference image frame continually tracks changes in the captured image frames.
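The frame-to-frame behaviour can be sketched in Python as follows. The sketch is deliberately simplified: frames are one dimensional lists, each block is encoded by a hypothetical stand-in encoder that keeps only the rounded block mean (in place of a codebook search), and the per-frame data limit is expressed as a `budget` counted in block updates. None of these names or choices comes from the specification.

```python
def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_block(block):
    """Hypothetical stand-in encoder: a block is represented by its rounded mean."""
    return round(sum(block) / len(block))


def decode_block(code, size):
    return [code] * size


def apply_updates(reference, updates, size):
    """Insert decoded updates into a reference array (used by sender and receiver)."""
    for start, code in updates:
        reference[start:start + size] = decode_block(code, size)


def compress_frame(frame, reference, size, budget):
    """Choose up to `budget` block updates giving the largest error reduction."""
    candidates = {s: encode_block(frame[s:s + size]) for s in range(0, len(frame), size)}
    updates = []
    while candidates and len(updates) < budget:
        def gain(start):
            block = frame[start:start + size]
            e1 = squared_error(block, decode_block(candidates[start], size))
            e2 = squared_error(block, reference[start:start + size])
            return e2 - e1
        best = max(candidates, key=gain)
        if gain(best) <= 0:
            break                                   # reference already tracks this frame
        update = (best, candidates.pop(best))
        apply_updates(reference, [update], size)    # sender keeps its own reference in step
        updates.append(update)
    return updates


sender_ref = [0] * 8      # same arbitrary reference held at both ends
receiver_ref = [0] * 8
frames = [
    [4, 4, 8, 8, 2, 2, 6, 6],
    [4, 4, 8, 8, 2, 2, 7, 7],   # only the last block changes
]
sent = []
for frame in frames:
    updates = compress_frame(frame, sender_ref, size=2, budget=4)
    apply_updates(receiver_ref, updates, 2)         # receiver decodes the same data
    sent.append(updates)
print(sent)
```

In this example the first frame uses the whole budget, while the second frame needs a single update for the one block that changed, and the two reference arrays remain identical throughout because both ends apply exactly the same updates.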

It will be appreciated that modifications to the above described embodiments may be made without departing from the scope of the present invention.

One such modification which may be made to the videophone implementation involves extracting patches at one or more levels from each reference image frame and adding these patches temporarily to a supplementary codebook which is updated after transmission of each image update. This process would be conducted at both ends of the transmission line. If motion of scene components occurs between frames of a sequence, it is likely that the reference patch which best improves the current reference frame will be a patch extracted from an adjacent region of the preceding image frame and these will therefore be taken from the supplementary codebook. The entries of the supplementary codebook searched for a given image patch may be restricted to those entries taken from regions of the reference image adjacent to the image patch for which the search is being conducted.

A further modification involves extracting reference patches from an auxiliary image and adding these to the codebook or to a supplementary codebook. Such an auxiliary image may be captured when the videophone is first turned on and the phone caller is not yet positioned in front of the viewed background. These additional reference patches will be transmitted to the receiving videophone prior to commencing the phone conversation. It will be appreciated that during a conversation, movement of the caller will continually cover and uncover areas of the background and that the compression algorithm will be able to find high quality matching reference patches, from the auxiliary image, for areas subsequently uncovered. The auxiliary image may be updated during the conversation, e.g. where `new` stationary objects are added. By marking reference patches with a flag, it is possible to replace the actual background with a substitute background at the receiving videophone. Detection of motion in the background may also be used to compensate the captured frames for camera jitter.

What is claimed is:
 1. A method of compressing an array of data entries in the form of digital electrical signals, the method comprising the steps of: (a) providing a reference data array as a first approximation to the original data array to be compressed; (b) dividing the original data array into a plurality of blocks of one or more sizes; (c) for each data block determining according to a predetermined encoding process a compressed encoding of the data contained therein, where the compressed encoding can be uncompressed to provide an approximation to the data block; (d) selecting from the plurality of compressed encodings that one of the compressed encodings which, when uncompressed and added to the reference data array, gives the biggest improvement in the reference data array relative to the original data array; (e) adding the selected compressed encoding in uncompressed form to the reference data array to thereby provide a second approximation to the original data array, and storing the selected compressed encoding in compressed form to establish a compressed representation of the original data array; and (f) recursively repeating steps (d) and (e) for the remaining compressed encodings successively to select further compressed encodings to provide reference data arrays which are successive approximations to the original data array and to add to the compressed representations of the original data array, until either a desired quality level of the reference data array is achieved or a maximum data size of the compressed representation is achieved.
 2. A method according to claim 1, wherein the predetermined encoding process of step (c) comprises searching a codebook of reference data blocks to find the closest matching codebook reference block, each codebook reference data block having a unique index assigned thereto, the compressed encoding comprising the index of the identified codebook reference data block.
 3. A method according to claim 1, wherein the predetermined encoding process of step (c) comprises determining a discrete cosine transform (DCT) or discrete Fourier transform (DFT) for each data block, the compressed encoding comprising the determined transform.
 4. A method according to claim 1, wherein the selected compressed encoding which results in the greatest improvement in the reference data array is identified by: determining the error between each of said data blocks and the respective compressed encodings in uncompressed form to provide an error set ε₁; determining the error between each of said data blocks and the respective corresponding blocks of the reference data array to provide an error set ε₂; comparing corresponding entries of the error sets ε₁, ε₂ and selecting that compressed encoding which maximises the difference value between such corresponding entries.
 5. A method according to claim 4, wherein the error of each of the error sets ε₁, ε₂ is the total squared difference between respective entries of the compared data blocks.
 6. A method of compressing a two dimensional image which comprises an array of pixels, each having a pixel intensity value or values, using the method of claim 1.
 7. A method according to claim 6, wherein the data blocks comprise patches of contiguous pixels.
 8. A method according to claim 7, wherein the image is divided on two or more levels to create several sets of overlapping and/or differently sized or shaped patches and where the compressed encodings derived for overlapping patches are combined in uncompressed form in the reference image using a predetermined criterion.
 9. A method of transmitting video pictures between a pair of videophones using the method of claim 6.
 10. A method according to claim 9, wherein the reference data array which forms said first approximation is an arbitrary image which is stored in a memory of the transmitting videophone and in a memory of the receiving videophone, the method comprising the steps of: (1) carrying out steps (a) to (e) for a first image frame captured by the transmitting videophone, wherein said compressed representation is stored in a buffer memory of the transmitting videophone; (2) transmitting the compressed representation to the receiving videophone; (3) carrying out steps (a) to (e) for a second image frame captured by the transmitting videophone, where the modified reference image obtained in step (1) as a second approximation to the original data array provides the reference image for step 3(a); (4) transmitting the resulting compressed representation to the receiving videophone; and (5) repeating steps (3) and (4) for the third and subsequently obtained image frames captured by the transmitting videophone.
 11. A method according to claim 10, comprising the steps of: receiving and decompressing the first transmitted compressed representation at the receiving videophone; displaying the decompressed representation and storing it as a reference image in a memory of the receiving videophone; and updating the reference image with each subsequently received compressed representation and storing and displaying the result.
 12. A method according to claim 10 when appended to claim 2 and comprising temporarily adding patches extracted from the final reference image obtained for each frame to the codebook, or to a supplementary codebook, prior to conducting a search of the codebook(s) for each succeeding frame.
 13. A method according to claim 12, wherein the codebook search is restricted to patches extracted from areas of the reference image of a preceding frame which are adjacent to the image patch for which a search is being conducted.
 14. A method according to claim 12, comprising storing in the codebook patches obtained from a plurality of frames directly preceding the frame to be compressed.
 15. Apparatus for compressing an array of data entries in the form of digital electrical signals, the apparatus comprising a digital computer, for example of known type, set up to run a program embodying the method of claim 1.